Evolutionary insights on critically endangered Kashmir red deer or hangul (Cervus hanglu hanglu) through a mitogenomic lens

Background The Kashmir red deer or Hangul (Cervus hanglu hanglu) is the only Tarim red deer species endemic to India. With a current estimated population size of fewer than 200 individuals, this critically endangered species is confined to the greater Dachigam landscape in Jammu and Kashmir. Poaching, habitat loss and fragmentation, resource competition with livestock, and small population size are the major conservation challenges for this species. Methods Blood sampling was conducted from two wild Hangul individuals during radio-collaring operations at Dachigam National Park, Kashmir in 2013 and 2020, respectively. Using next-generation sequencing approach, we sequenced the 16,351 bp long mitogenome of two wild-caught Hangul individuals (1 M:1 F at ~14× and ~10× coverage, respectively) from Dachigam National Park. Results The annotated sequences were identical with an AT-rich composition, including 13 protein-coding genes (11,354 bp), 22 tRNA genes (1,515 bp), two ribosomal genes (2,526 bp) and a non-coding control region (917 bp) in a conserved order like other red deer species. Bayesian phylogenetic reconstruction of the red deer complex revealed two major groups: the elaphoid and the wapitoid clades. Hangul formed a distinct clade with its other subspecies C. hanglu yarkandensis and is sister to the Hungarian red deer (C. elaphus hippelaphus). Divergence time analyses suggested that the Tarim deer species group separated ~1.55 Mya from their common ancestors and Hangul diverged ~0.75 Mya from closely related C. yarkandensis, corroborating with the known paleobiogeographic events related to refugia during glaciations in the Pleistocene era. This study provides baseline information on Hangul mitogenome for further research on phylogeography and other population parameters and helps in developing suitable conservation plans for this species.


INTRODUCTION
The Kashmir red deer or Hangul (Cervus hanglu hanglu) is the only member of the Tarim red deer complex that is endemic (Brook et al., 2017) to the Indian subcontinent.Historically distributed across high altitudes of Kashmir, valleys of Chenab River and upper Himachal Pradesh, this subspecies is currently confined to the greater Dachigam landscape in the north-east of Srinagar, Jammu and Kashmir (Ahmad, Sathyakumar & Qureshi, 2009;Ahmad & Nigam, 2014) with only viable population found in Dachigam National Park (Ahmad, Sathyakumar & Qureshi, 2009;Ahmad & Nigam, 2014;Mukesh et al., 2015).With a current global population size of <200 individuals (reduced from ~3,000-5,000 animals during 1990s) and decreasing trend, Hangul is currently considered as 'Critically endangered' by IUCN (under C1 criteria) (Brook et al., 2017) and 'Schedule 1' under the Indian Wildlife (Protection) Act, 1972 (i.e., the highest protection status).The major concerns for this subspecies across its distribution are small population size, low fawn-female ratio, reduced calf survival (biological factors) and poaching, habitat loss and fragmentation, and competition for resources from livestock (anthropogenic factors) (Ahmad, Sathyakumar & Qureshi, 2009;Ahmad & Nigam, 2014;Mukesh et al., 2015).
Phylogenetically Hangul belongs to the red deer complex which has been extensively studied because of their global distribution (Ludt et al., 2004;Lorenzini & Garofalo, 2015;Kumar et al., 2017;Meiri et al., 2018;Mackiewicz et al., 2022).Recent genetic analyses suggest existence of two red deer groups: the eastern/Wapiti (consisting of species distributed across eastern Asia and North America) and western/Tarim deer (species ranging across Europe, North Africa, Middle East, and central Asia) (Lorenzini & Garofalo, 2015;Kumar et al., 2017;Meiri et al., 2018;Mackiewicz et al., 2022).Hangul, despite being genetically identified as a separate group within this species complex till date has not been studied in detail.While information on population status, distribution and conservation challenges is available (Ahmad, Sathyakumar & Qureshi, 2009;Ahmad & Nigam, 2014;Mukesh et al., 2015), genetic analyses related to their phylogenetic position and divergence is confined to single marker-based studies, single species within group (Cervus hanglu yarkandensisr) or inconclusive (Ludt et al., 2004;Lorenzini & Garofalo, 2015;Kumar et al., 2017;Meiri et al., 2018;Hu et al., 2019;Doan et al., 2022;Mackiewicz et al., 2022), possibly due to lack of sufficient data.In this study, we present next generation sequencing based whole mitogenome data and combine the information with already available genetic data to assess phylogenetic position of Hangul among central Asian red deer clade and its sister groups.In addition, we also provide insights on its divergence period and compare our results with available information.We believe that such detailed analyses with mitogenome would help us in generating valuable information on its evolution, genetic health and therefore assist in developing appropriate conservation plans for the isolated and small populations of this species.

Sampling
Apparently healthy adult Hangul (n = 2, 1 M:1 F) individuals captured for satellite collaring were sampled in this study.The male was captured in 2013 (estimated weight of ~150 kg) whereas the female was captured in 2020 (estimated weight of ~120 kg).In both cases, the animals were lured into an open forest patch using green fodder and salt and were chemically immobilized using a combination of medetomidine (10 mg/ml, Wildlife Pharmaceuticals Inc., Fort Collins, CO, USA) and ketamine hydrochloride (100 mg/ml, Troy laboratories Pvt. Ltd., Glendenning, NSW, Australia) at dose rates of 90 mg kg −1 and 1.45 mg kg −1 for male and 120 mg kg −1 and 2.08 mg kg −1 for female, respectively (Jalanka, 1988;Jalanka & Roeken, 1990).Drug delivery was carried out remotely from a hide using Dan-inject syringe projector (Model-IM) from a distance of 30-40 m.The animals were approached after achieving sedation safe for handling, and blood sampling was conducted.Blood was collected aseptically (5 ml from each animal) by jugular vein puncture in EDTA BD Vacutainer (BD Diagnostics, Franklin Lakes, NJ, USA) and transported to the laboratory within 2 h and stored at −20 C till further processing.Drug reversal was assisted by administering 5 mg/ml atipamezole (Dechra, Northwich, United Kingdom) intramuscularly at a dose rate of 0.2 mg kg −1 body weight following completion of necessary procedures.

DNA extraction and mitogenome sequencing
Mitogenome sequencing of Hangul was performed using next-generation sequencing (NGS) approach.Genomic DNA was extracted from two blood samples using MN NucleoSpin Blood kit (Macherey-Nagel, Duren, Germany) following manufacturers' instructions.DNA quality was checked through 1% agarose gel electrophoresis and Qubit 3.0 fluorometer (Invitrogen, Life Technologies, Waltham, MA, USA).Following this, paired-end libraries were prepared using NEBNext Ultra DNA Library Prep kit (New England Biolabs, Ipswich, MA, USA) where the genomic DNA was enzymatically fragmented, end-polished, A-tailed, and ligated with full length Illumina sequencing adapters.The PCR amplification was conducted using universal P5 and indexed P7 oligos with following conditions: initial denaturation at 98 C for 30 s, followed by four cycles of denaturation (98 C for 10 s), annealing (65 C for 75 s) and final extension (65 C for 5 min).The constructed libraries were checked for purity, size, and concentration using Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA).Finally, the qualified libraries (2 × 150 bp) were sequenced on Illumina NovaSeq6000 (Illumina, San Diego, CA, USA) platform at the Next Generation Genomics Facility of Nucleome Informatics Pvt.Ltd, Hyderabad, India.The raw sequence reads were screened using FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) for quality assurances and further analysed using FastP ver 0.20.1 (Chen et al., 2018) to remove the adapter sequences, low-quality reads (Q < 20) and reads with <30 bp length.After quality filtering, high-quality reads were assembled using animal mtDNA database sequences integrated in GetOrganelle ver 1.7.5 (Jin et al., 2020).The process involved following steps: (a) mapping reads to seed sequence against database using Bowtie 2 (Langmead & Salzberg, 2012), (b) conducting de novo assembly using SPAdes with different k-mer values (Bankevich et al., 2012), and (c) accurate identification of target organelle contigs.The assembled mitogenomes were validated by mapping against the red deer (Cervus elaphus) mitogenome (CerEla 1.0, OU343111.1).
A two-parameter substitution model (GTR+G, decided using jModelTest (Darriba et al., 2012)) along with a gamma distribution of evolutionary rates (across sites), and default shape parameter setting were used in the phylogenetic done via MrBayes v 3.2.7 (Ronquist et al., 2012).The MCMC parameters incorporated two runs of four chains (1 million generations each with sampling after 1,000 generations) till split frequencies were less than 0.01.For each node posterior probabilities were calculated.To estimate clade-specific divergence of Cervus hanglu, birth-death speciation was considered as tree prior with uncorrelated relaxed log normal clock (Ritchie, Lo & Ho, 2017;Steiner, Houck & Ryder, 2018) in BEAST v.2.3.0 (Drummond & Rambaut, 2007).During analysis, four fossil based internal node calibration points were employed with normal distribution priors: (i) root age for the split of Cervidae and Bovidae at 18 ± 2 Mya, (ii) Cervini-Muntiacini split around 9 ± 1 Mya, and (iii) oldest fossil of tribe Odocoileini to be 5 ± 1 Mya (Zhang & Zhang, 2012).The Cervini group was considered under monophyletic constraints.MCMC runs for tMRCA (the Most Recent Common Ancestor) inference involved 100 million generations, sampled at every 10,000 states with 10% burn-in.Data convergence was analysed using Tracer v. 1.5 (Rambaut & Drummond, 2007) and the divergence time tree which showed maximum clade credibility was inferred with TreeAnnotator (Helfrich et al., 2018) and visualised in FigTree v.1.4.2 (Rambaut, 2009).To assess the effects of data size in determining the patterns of phylogenetic relationship and divergence time the entire analyses were conducted using two additional data set: (a) with complete Cyt b and (b) complete Control region as these two markers have earlier been used in Hangul phylogeny work.

Mitogenome organisation
A total of 215.84 million (female sample) and 284.22 million (male sample) sequence reads were generated through NGS approach.For mitogenome assembly, following paired and unpaired mapped reads were used: female-50,466 and 3,070; male-114,948 and 5,152.
After data filtering, 213.7 million (female, 99.02%) and 281.78 million (male, 99.1%) of quality reads were obtained.The assembly lead to ~10Â (female, 29.8 GB data) and ~13.74Â (male, 39.3 GB data) coverage of Hangul mitogenome.The assembled mitogenome of Hangul generated in this study was of 16,351 bp length and matched with the available Hangul and other species reference sequences.The GenBank accession numbers for the sequences are ON416884 (female) and ON416885 (male), respectively.The annotated Hangul mitogenome showed conserved gene order like other red deer species i.e. protein-coding genes (PCGs, n = 13), tRNA genes (n = 22), ribosomal genes (n = 2) and a non-coding control region (Table 2).Out of these, 29 genes are encoded on the heavy strand (H-strand) and remaining nine (NADH6 and eight tRNA genes: tRNA Gln , tRNA Asn , tRNA Ala , tRNA Cys , tRNA Tyr , tRNA Ser , tRNA Glu and tRNA Pro ) are encoded on the light strand (L-strand).The number of overlapping regions (n = 13) and intergenic spaces (n = 15) were also consistent with other red deers (Table 2).NADH1, NADH2, NADH4L, NADH5, two cytochrome oxidases (COII and COIII), ATP8 and ATP6 and O L, showed overlaps ranging from 1 to 40 bp.The tRNA Leu gene showed the largest overlap (9 bp) with NADH5 whereas three tRNA genes (tRNA Ile , tRNA Thr , tRNA Gly ) showed the smallest overlap (1-3 bp) (Table 2).Similarly, the intergenic spaces ranged between 1-7 bp, with the longest between tRNA Ser and tRNA Asp , respectively.The O L was 31 bp long and found in the WANCY region between tRNA Asn and tRNA Cys .
Overall, the nucleotide composition comprised of 28.8% T, 24.5% C, 33.3%A and 13.5% G, with a positive AT value (0.072) (and a negative GC value of −0.289), indicating a AT-rich Hangul mitogenome (Table 1).

Protein coding genes
Hangul mitochondrial protein-coding genes (PCGs) was 11,354 bp in length with an AT-skewed base composition of 29.7% T, 25.9% C, 32.3%A and 12.2% G (Table 1).Out of the 13 coding genes majority (n = 12) were on the H-stand.These included six NADH genes (NADH1, NADH2, NADH3, NADH4, NADH5, NADH6 and NADH4L), ATPases (ATP6 and ATP8), cytochrome c oxidases (COI, COII and COIII) and cytochrome b (Cyt b).NADH4L was coded on the L-strand consistent with mitogenome organisation of other red deer species.NADH2, NADH3 and NADH5 have ATA start codon compared to others that have ATG.Only Cyt b (AGA), NADH2 (TAG), NADH4 and COIII had incomplete stop codon (T_ _) at the 5′ terminal of the adjacent gene to be completed during post transcription modification i.e. polyadenylation process.The Relative Synonymous Codon Usage (RSCU) for all PCGs consist of 3,784 codons.

rRNA, tRNA genes and control region
Hangul mitogenome has two rRNA genes: 12s rRNA (located between tRNA Phe and tRNA Val ) and 16s rRNA (between tRNA Val and tRNA Leu ) (Fig. 1).The total length of these genes is 2,526 bp with an average positive AT skew (0.218) and negative GC skew (−0.097) (Table 1).In addition, it also contains 22 tRNA genes (total length of 1,515 bp) with length varying from 60 bp (tRNA Ser ) to 75 bp (tRNA Leu ) (Table 1).Most of these tRNAs (n = 14) are located on H-strand and majority (n = 21) showed the typical secondary cloverleaf structures.Leucine and Serine have two tRNAs each and tRNA Ser (GCT) lacks the D-paired arm.The Control region (CR) is 917 bp long and located between tRNA Pro and tRNA Phe .The base composition of CR was 31.7%T, 23.6% C, 29.4% A and 15.3% G with average negative AT (−0.038) and GC (−0.213) skew (Table 1).(Kurten, 1968;Kuwayama & Ozawa, 2000).At Hangul group level, results suggest a coalescence of 0.75 Mya (HPD 1.51-0.18Mya) between C. hanglu and C. yarkandensis.

Phylogenetic analysis and divergence time estimate
The Hangul sequences generated from the male and female individuals were almost identical with five variable sites (n = 3 in PCGs and n = 2 in noncoding region).However, the Hangul mitogenome from GenBank showed 15 variable sites (n = 6 in the coding and n = 9 in the non-coding regions with three insertions in the rRNA region) with the sequences generated in this study.The mitogenomes of Hangul and closely related Cervus h.yarkandensis showed 345 variable sites.The highest difference was found with Cervus n. yakushimae (795 variable sites), but during annotation it was found that part of the mitogenome was missing and therefore was removed from further genetic distance analyses.While the protein coding regions among all the deer remained consistent (11,354 bp), major differences were found in the non-coding regions which varied from 913 bp for Cervus h.yarkandensis to 1,231 bp for Cervus n. centralis.One to seven base pair differences were seen for both tRNA (1,515-1,517 bp) and rRNA (2,525-2,532 bp) regions across the species (Table 1).As expected, the Hangul mitogenomes were genetically much closer (genetic distance = 0.001) than all other red deer species (genetic distance value varied from 0.022 to 0.052) (Table S1).

DISCUSSION
The Ministry of Environment, Forests and Climate Change, Government of India has listed Hangul as one of the 22 important Indian species that require immediate attention for future survival and recovery (https://wiienvis.nic.in/Database/SRP_8555.aspx).As part of this long-term initiative, the government has initiated a species recovery plan for this critically endangered species, especially focusing on creating a second population outside Dachigam national park (largest viable population), Jammu and Kashmir and establishing a conservation breeding facility.However, the conservation breeding program is facing challenges in selecting appropriate founder animals from the existing small and isolated hangul population in Kashmir (Ahmad & Nigam, 2014;Mukesh et al., 2015), specifically with limited knowledge on certain aspects of its biology including the patterns of genetic variation and evolutionary history.Earlier Hangul phylogeny research indicated contradictory patterns of its relationship with other species of the red deer group, mostly due to lack of data from Hangul.Comparative analyses of all published research on the species phylogeny suggest different outcomes.For example, Cytochrome b, control regions and whole mitogenome based phylogenetic research have shown similar sister-group relationships between the European red deer and Hangul (Ludt et al., 2004;Lorenzini & Garofalo, 2015;Meiri et al., 2018;Doan et al., 2018;Olivieri et al., 2014;Mackiewicz et al., 2022).However,  (Heckeberg, 2020).Our complete mitogenome-based phylogeny supports the inferences arrived from the cytochrome b-based phylogeny as well as other studies involving four species of red deer (C.nippon, C. elaphus, C. canadensis and C. hanglu) (Zhang & Zhang, 2012;Frank et al., 2016;Li, Ba & Yang, 2016;Świsłocka et al., 2020;Mackiewicz et al., 2022).Given the support from the other studies, we believe that the mitogenome-based phylogeny of Hangul provides a clear relationship status within the Tarim red deer group.One noteworthy point is the non-availability of whole mitogenome sequence of C. h.bactrianus in this study, and when the mitogenome information is available it will provide the complete picture for the group's phylogenetic relationship.
Another important outcome of this study is the assessment of Hangul divergence time within the existing Tarim red deer members (except C. h.bactrianus).The divergence period analyses are dependent on various factors such as molecular marker, mutation rate, node calibration points etc., Ritchie, Lo & Ho (2017).Overall, the red deer group divergence period varied between 7-1 Mya across different studies (Doan et al., 2022), probably due to differences in marker selection and analytical settings.For example, mitogenome-based divergence time estimation between European red deer and Tarim red deer (C.h.Yarkandensis) was found to be ~1.65 Mya (Frank et al., 2016) and ~1.88 Mya (Mackiewicz et al., 2022).Doan et al. (2018) reported a divergence period of ~1.54 Mya between the western (consisting of C. elaphus and C. hanglu) and the eastern (C.nippon and C. canadensis) red deer groups, whereas Zhang & Zhang (2012) along with Mackiewicz et al. (2022) suggested that ~2 Mya divergence period between the same groups.Our analyses provide a similar divergence period of ~2.36 Mya (4.03-1.06Mya) between western and eastern groups of red deer.Within the western clade the Tarim deers diverged Mitogenome-based Bayesian phylogenetic reconstruction of the red deer complex (with 18 available species sequences including outgroups) reveals four mitochondrial lineages: C. elaphus, C. hanglu, C. canadensis and C. nippon.These lineages can be broadly divided into two major groups: the elaphoid clade (C.hanglu and C. elaphus) and the wapitoid clade (C.canadensis and C. nippon).The C. elaphus and C. hanglu forms the basal clade, followed by the C. canadensis and C. nippon as sister groups.The Hangul (C.hanglu hanglu) sequences (both generated in this study and the already available one) formed a distinct clade with its other subspecies C. hanglu yarkandensis and sister to the Hungarian red deer (C.elaphus hippelaphus).The tMRCA analysis suggested a divergence period spanning from 2.36 (Height Posterior Density (HPD) 4.23-1.06 million years ago (Mya)) to 0.16 (HPD 0.35-0.03Mya) Mya (Fig.2) for the red deer species complex.Our results indicated the divergence of the elaphoid group from the wapitoid group ~2.36 Mya, corresponding to the early Pleistocene period(Kuwayama & Ozawa,

Figure 1
Figure 1 Complete mitochondrial genome of Hangul (Cervus hanglu hanglu) with location of genes.The genes in the heavy strands are shown outside.The inner grey circle depicts the GC content on each strand.Full-size  DOI: 10.7717/peerj.15746/fig-1

Figure 2
Figure 2 Phylogenetic relationship and assessment of divergence time in red deer species group.The right pane shows the phylogenetic relationship among different red deer species.For all nodes posterior probability values were ≥0.9.Three bovid species were used as outgroups.The left pane indicates the divergence times of this entire group.Node-specific ages are marked (with posterior probability values ≥0.9) along with the HPD intervals.Full-size  DOI: 10.7717/peerj.15746/fig-2

Table 1
Nucleotide composition comparison with other red deer species.
Meiri et al. (2018))omplete cytochrome b)andKumar et al. (2017, partial  cytochrome b)(Kumar et al., 2017)have inferred Hangul as a basal group to the entire red deer complex.The most informative phylogenetic relationship among the Tarim red deers were presented byLorenzini & Garofalo (2015)(cytochrome b and partial control region) andMeiri et al. (2018)(partial cytochrome b and control region) where all the species (C.h.hanglu, C. h.bactrianus and C. h.Yarkandensis) showed sister-group relationship with European red deer (Elaphoid group).Other studies involving C. h.Yarkandensis (closely related to C. h.hanglu and C. h.bactrianus) also showed similar relationship with the Elaphoid group (Doan et al., 2018; Mackiewicz et al., 2022).Based on genomic SNP markers, Hu et al. (2019) suggested a different phylogenetic relationship where Tarim deer (represented by C. h.Yarkandensis) formed a sister clade with Cervus canadensis, which was later explained as a result of frequent hybridisation amongst the Cervus group members